An Annotated Bangla Sentiment Analysis Corpus
Fuad Rahman, Aminul Islam, Habibur Khan, Zakir Hossain, Mahfuza Begum, Sadia Mahanaz, Ashraful Islam
27-28th September, 2019 - 2nd International Conference on Bangla Speech and Language Processing (ICBSLP) 2019
Description
This paper presents a Bangla corpus specifically targeted for sentiment analysis and made available to researchers under an open-source licensing scheme1. We have collected and manually annotated over 10,000 sentences with sentiment polarity. We then moved to the Word domain and annotated over 15,000 words derived from these sentences with sentiment polarity. Each entry is the corpus has been cross-annotated by at least two and sometimes three annotators for ensuring quality. Also as a pre-requisite of creating a high quality sentiment analysis corpus, we had to build a secondary corpus for Bangla word stemming, which is also been cross-validated by at least two and sometimes three annotators for ensuring quality.
View Publication